Overview

Dataset statistics

Number of variables: 9
Number of observations: 1030
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 11
Duplicate rows (%): 1.1%
Total size in memory: 72.5 KiB
Average record size in memory: 72.1 B

Variable types

Numeric: 9

Alerts

Dataset has 11 (1.1%) duplicate rows [Duplicates]
water is highly correlated with superplasticizer [High correlation]
superplasticizer is highly correlated with water [High correlation]
age is highly correlated with concrete_compressive_strength [High correlation]
concrete_compressive_strength is highly correlated with age [High correlation]
cement is highly correlated with fly_ash [High correlation]
fly_ash is highly correlated with cement [High correlation]
cement is highly correlated with blast_furnace_slag and 6 other fields [High correlation]
blast_furnace_slag is highly correlated with cement and 5 other fields [High correlation]
fly_ash is highly correlated with cement and 5 other fields [High correlation]
water is highly correlated with cement and 5 other fields [High correlation]
superplasticizer is highly correlated with cement and 5 other fields [High correlation]
coarse_aggregate is highly correlated with cement and 5 other fields [High correlation]
fine_aggregate is highly correlated with cement and 5 other fields [High correlation]
concrete_compressive_strength is highly correlated with cement [High correlation]
blast_furnace_slag has 471 (45.7%) zeros [Zeros]
fly_ash has 566 (55.0%) zeros [Zeros]
superplasticizer has 379 (36.8%) zeros [Zeros]
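
The percentages in the duplicate and zero alerts appear to be each count divided by the number of observations (1030). A minimal sketch of that arithmetic follows; the `share` helper and the dictionary keys are illustrative names, not part of pandas-profiling.

```python
# Counts copied from the alerts; N_ROWS is the "Number of observations" value.
N_ROWS = 1030


def share(count: int, total: int = N_ROWS) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


alert_counts = {
    "duplicate rows": 11,
    "blast_furnace_slag zeros": 471,
    "fly_ash zeros": 566,
    "superplasticizer zeros": 379,
}

# Recompute the percentage shown next to each count.
percentages = {name: share(count) for name, count in alert_counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```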

Reproduction

Analysis started: 2022-08-30 17:40:42.319225
Analysis finished: 2022-08-30 17:40:55.940947
Duration: 13.62 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

cement
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 278
Distinct (%): 27.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 281.1678641
Minimum: 102
Maximum: 540
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 102
5-th percentile: 143.745
Q1: 192.375
median: 272.9
Q3: 350
95-th percentile: 480
Maximum: 540
Range: 438
Interquartile range (IQR): 157.625

Descriptive statistics

Standard deviation: 104.5063645
Coefficient of variation (CV): 0.3716867318
Kurtosis: -0.5206522845
Mean: 281.1678641
Median Absolute Deviation (MAD): 79.4
Skewness: 0.5094811789
Sum: 289602.9
Variance: 10921.58022
Monotonicity: Not monotonic
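
Several of these figures can be derived from the others. The sketch below, using only values copied from the cement block, checks the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean × number of observations). The variable names are illustrative.

```python
import math

# Values copied from the cement block of the report.
n_rows = 1030
mean, std = 281.1678641, 104.5063645
minimum, maximum = 102.0, 540.0
q1, q3 = 192.375, 350.0

value_range = maximum - minimum  # report shows Range: 438
iqr = q3 - q1                    # report shows IQR: 157.625
cv = std / mean                  # report shows CV: 0.3716867318
variance = std ** 2              # report shows Variance: 10921.58022
total = mean * n_rows            # report shows Sum: 289602.9

# Compare with the reported figures, allowing for their rounding.
print(value_range == 438.0, iqr == 157.625)
print(math.isclose(cv, 0.3716867318, rel_tol=1e-6))
print(math.isclose(variance, 10921.58022, rel_tol=1e-6))
print(math.isclose(total, 289602.9, rel_tol=1e-6))
```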
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
362.6               20     1.9%
425                 20     1.9%
251.4               15     1.5%
310                 14     1.4%
446                 14     1.4%
331                 13     1.3%
475                 13     1.3%
250                 13     1.3%
349                 12     1.2%
387                 12     1.2%
Other values (268)  884    85.8%

Value               Count  Frequency (%)
102                 4      0.4%
108.3               4      0.4%
116                 4      0.4%
122.6               4      0.4%
132                 2      0.2%
133                 5      0.5%
133.1               1      0.1%
134.7               1      0.1%
135                 2      0.2%
135.7               2      0.2%

Value               Count  Frequency (%)
540                 9      0.9%
531.3               5      0.5%
528                 1      0.1%
525                 7      0.7%
522                 2      0.2%
520                 2      0.2%
516                 2      0.2%
505                 1      0.1%
500.1               1      0.1%
500                 10     1.0%

blast_furnace_slag
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 185
Distinct (%): 18.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 73.89582524
Minimum: 0
Maximum: 359.4
Zeros: 471
Zeros (%): 45.7%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 22
Q3: 142.95
95-th percentile: 236
Maximum: 359.4
Range: 359.4
Interquartile range (IQR): 142.95

Descriptive statistics

Standard deviation: 86.27934175
Coefficient of variation (CV): 1.167580732
Kurtosis: -0.5081754789
Mean: 73.89582524
Median Absolute Deviation (MAD): 22
Skewness: 0.8007168956
Sum: 76112.7
Variance: 7444.124812
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0                   471    45.7%
189                 30     2.9%
106.3               20     1.9%
24                  14     1.4%
20                  12     1.2%
145                 11     1.1%
98.1                10     1.0%
19                  10     1.0%
26                  8      0.8%
22                  8      0.8%
Other values (175)  436    42.3%

Value               Count  Frequency (%)
0                   471    45.7%
11                  4      0.4%
13.6                5      0.5%
15                  5      0.5%
17.2                1      0.1%
17.5                1      0.1%
17.6                1      0.1%
19                  10     1.0%
20                  12     1.2%
22                  8      0.8%

Value               Count  Frequency (%)
359.4               2      0.2%
342.1               2      0.2%
316.1               2      0.2%
305.3               4      0.4%
290.2               2      0.2%
288                 4      0.4%
282.8               4      0.4%
272.8               2      0.2%
262.2               5      0.5%
260                 1      0.1%

fly_ash
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 156
Distinct (%): 15.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.18834951
Minimum: 0
Maximum: 200.1
Zeros: 566
Zeros (%): 55.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 118.3
95-th percentile: 167
Maximum: 200.1
Range: 200.1
Interquartile range (IQR): 118.3

Descriptive statistics

Standard deviation: 63.99700415
Coefficient of variation (CV): 1.181010397
Kurtosis: -1.328746435
Mean: 54.18834951
Median Absolute Deviation (MAD): 0
Skewness: 0.5373539058
Sum: 55814
Variance: 4095.616541
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0                   566    55.0%
118.3               20     1.9%
141                 16     1.6%
24.5                15     1.5%
79                  14     1.4%
94                  13     1.3%
100.4               11     1.1%
125.2               10     1.0%
95.7                10     1.0%
98.8                10     1.0%
Other values (146)  345    33.5%

Value               Count  Frequency (%)
0                   566    55.0%
24.5                15     1.5%
59                  1      0.1%
60                  1      0.1%
71                  1      0.1%
71.5                1      0.1%
75.6                1      0.1%
76                  1      0.1%
77                  2      0.2%
78                  2      0.2%

Value               Count  Frequency (%)
200.1               1      0.1%
200                 1      0.1%
195                 3      0.3%
194.9               1      0.1%
194                 1      0.1%
193                 1      0.1%
190                 1      0.1%
187                 1      0.1%
185.3               1      0.1%
185                 2      0.2%

water
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 195
Distinct (%): 18.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 181.5672816
Minimum: 121.8
Maximum: 247
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 121.8
5-th percentile: 146.1
Q1: 164.9
median: 185
Q3: 192
95-th percentile: 228
Maximum: 247
Range: 125.2
Interquartile range (IQR): 27.1

Descriptive statistics

Standard deviation: 21.35421857
Coefficient of variation (CV): 0.1176104989
Kurtosis: 0.1220816744
Mean: 181.5672816
Median Absolute Deviation (MAD): 13
Skewness: 0.07462838429
Sum: 187014.3
Variance: 456.0026505
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
192                 118    11.5%
228                 54     5.2%
185.7               46     4.5%
203.5               36     3.5%
186                 28     2.7%
164.9               20     1.9%
162                 20     1.9%
185                 15     1.5%
153.5               15     1.5%
200                 14     1.4%
Other values (185)  664    64.5%

Value               Count  Frequency (%)
121.8               5      0.5%
126.6               5      0.5%
127                 1      0.1%
127.3               1      0.1%
137.8               5      0.5%
140                 1      0.1%
140.8               5      0.5%
141.8               5      0.5%
142                 1      0.1%
143.3               5      0.5%

Value               Count  Frequency (%)
247                 1      0.1%
246.9               1      0.1%
237                 1      0.1%
236.7               1      0.1%
228                 54     5.2%
221.4               1      0.1%
221                 2      0.2%
220.1               1      0.1%
220                 2      0.2%
219.7               1      0.1%

superplasticizer
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 111
Distinct (%): 10.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.204660194
Minimum: 0
Maximum: 32.2
Zeros: 379
Zeros (%): 36.8%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 6.4
Q3: 10.2
95-th percentile: 16.055
Maximum: 32.2
Range: 32.2
Interquartile range (IQR): 10.2

Descriptive statistics

Standard deviation: 5.973841392
Coefficient of variation (CV): 0.9627991228
Kurtosis: 1.411268965
Mean: 6.204660194
Median Absolute Deviation (MAD): 5.3
Skewness: 0.9072025749
Sum: 6390.8
Variance: 35.68678098
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0                   379    36.8%
11.6                37     3.6%
8                   27     2.6%
7                   19     1.8%
6                   17     1.7%
9.9                 16     1.6%
8.9                 16     1.6%
7.8                 16     1.6%
9                   16     1.6%
10                  15     1.5%
Other values (101)  472    45.8%

Value               Count  Frequency (%)
0                   379    36.8%
1.7                 4      0.4%
1.9                 1      0.1%
2                   1      0.1%
2.2                 1      0.1%
2.5                 2      0.2%
3                   6      0.6%
3.1                 1      0.1%
3.4                 3      0.3%
3.6                 5      0.5%

Value               Count  Frequency (%)
32.2                5      0.5%
28.2                5      0.5%
23.4                5      0.5%
22.1                1      0.1%
22                  6      0.6%
20.8                1      0.1%
20                  1      0.1%
19                  1      0.1%
18.8                1      0.1%
18.6                5      0.5%

coarse_aggregate
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 284
Distinct (%): 27.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 972.918932
Minimum: 801
Maximum: 1145
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 801
5-th percentile: 842
Q1: 932
median: 968
Q3: 1029.4
95-th percentile: 1104
Maximum: 1145
Range: 344
Interquartile range (IQR): 97.4

Descriptive statistics

Standard deviation: 77.75395397
Coefficient of variation (CV): 0.07991822485
Kurtosis: -0.5990161032
Mean: 972.918932
Median Absolute Deviation (MAD): 46.3
Skewness: -0.04021974481
Sum: 1002106.5
Variance: 6045.677357
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
932                 57     5.5%
852.1               45     4.4%
944.7               30     2.9%
968                 29     2.8%
1125                24     2.3%
1047                19     1.8%
967                 19     1.8%
974                 12     1.2%
942                 12     1.2%
938                 12     1.2%
Other values (274)  771    74.9%

Value               Count  Frequency (%)
801                 4      0.4%
801.1               1      0.1%
801.4               1      0.1%
811                 2      0.2%
814                 1      0.1%
814.1               1      0.1%
817.9               1      0.1%
818                 1      0.1%
819                 2      0.2%
819.2               1      0.1%

Value               Count  Frequency (%)
1145                1      0.1%
1134.3              5      0.5%
1130                1      0.1%
1125                24     2.3%
1124.4              2      0.2%
1120                2      0.2%
1119                2      0.2%
1118.8              2      0.2%
1118                1      0.1%
1113                2      0.2%

fine_aggregate
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 302
Distinct (%): 29.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 773.5804854
Minimum: 594
Maximum: 992.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 594
5-th percentile: 613
Q1: 730.95
median: 779.5
Q3: 824
95-th percentile: 898.09
Maximum: 992.6
Range: 398.6
Interquartile range (IQR): 93.05

Descriptive statistics

Standard deviation: 80.17598014
Coefficient of variation (CV): 0.1036427129
Kurtosis: -0.1021769893
Mean: 773.5804854
Median Absolute Deviation (MAD): 45.5
Skewness: -0.2530095977
Sum: 796787.9
Variance: 6428.187792
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
755.8               30     2.9%
594                 30     2.9%
670                 23     2.2%
613                 22     2.1%
801                 16     1.6%
746.6               15     1.5%
887.1               15     1.5%
712                 14     1.4%
845                 14     1.4%
750                 12     1.2%
Other values (292)  839    81.5%

Value               Count  Frequency (%)
594                 30     2.9%
605                 5      0.5%
611.8               5      0.5%
612                 1      0.1%
613                 22     2.1%
613.2               2      0.2%
614                 1      0.1%
623                 2      0.2%
630                 5      0.5%
631                 4      0.4%

Value               Count  Frequency (%)
992.6               5      0.5%
945                 4      0.4%
943.1               4      0.4%
942                 4      0.4%
925.7               5      0.5%
905.9               5      0.5%
903.8               5      0.5%
903.6               5      0.5%
901.8               5      0.5%
900.9               5      0.5%

age
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 14
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 45.66213592
Minimum: 1
Maximum: 365
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 7
median: 28
Q3: 56
95-th percentile: 180
Maximum: 365
Range: 364
Interquartile range (IQR): 49

Descriptive statistics

Standard deviation: 63.16991158
Coefficient of variation (CV): 1.383419989
Kurtosis: 12.16898898
Mean: 45.66213592
Median Absolute Deviation (MAD): 21
Skewness: 3.269177401
Sum: 47032
Variance: 3990.437729
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
Value               Count  Frequency (%)
28                  425    41.3%
3                   134    13.0%
7                   126    12.2%
56                  91     8.8%
14                  62     6.0%
90                  54     5.2%
100                 52     5.0%
180                 26     2.5%
91                  22     2.1%
365                 14     1.4%
Other values (4)    24     2.3%

Value               Count  Frequency (%)
1                   2      0.2%
3                   134    13.0%
7                   126    12.2%
14                  62     6.0%
28                  425    41.3%
56                  91     8.8%
90                  54     5.2%
91                  22     2.1%
100                 52     5.0%
120                 3      0.3%

Value               Count  Frequency (%)
365                 14     1.4%
360                 6      0.6%
270                 13     1.3%
180                 26     2.5%
120                 3      0.3%
100                 52     5.0%
91                  22     2.1%
90                  54     5.2%
56                  91     8.8%
28                  425    41.3%

concrete_compressive_strength
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 845
Distinct (%): 82.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.81796117
Minimum: 2.33
Maximum: 82.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 2.33
5-th percentile: 10.961
Q1: 23.71
median: 34.445
Q3: 46.135
95-th percentile: 66.802
Maximum: 82.6
Range: 80.27
Interquartile range (IQR): 22.425

Descriptive statistics

Standard deviation: 16.70574196
Coefficient of variation (CV): 0.4664068366
Kurtosis: -0.3137248604
Mean: 35.81796117
Median Absolute Deviation (MAD): 10.93
Skewness: 0.4169772884
Sum: 36892.5
Variance: 279.0818145
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
33.4                6      0.6%
77.3                4      0.4%
79.3                4      0.4%
31.35               4      0.4%
71.3                4      0.4%
35.3                4      0.4%
23.52               4      0.4%
41.05               4      0.4%
44.28               3      0.3%
41.54               3      0.3%
Other values (835)  990    96.1%

Value               Count  Frequency (%)
2.33                1      0.1%
3.32                1      0.1%
4.57                1      0.1%
4.78                1      0.1%
4.83                1      0.1%
4.9                 1      0.1%
6.27                1      0.1%
6.28                1      0.1%
6.47                1      0.1%
6.81                1      0.1%

Value               Count  Frequency (%)
82.6                1      0.1%
81.75               1      0.1%
80.2                1      0.1%
79.99               1      0.1%
79.4                1      0.1%
79.3                4      0.4%
78.8                1      0.1%
77.3                4      0.4%
76.8                1      0.1%
76.24               1      0.1%

Interactions

[81 pairwise interaction plots, one per ordered pair of the 9 variables; images not preserved in this text]

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, so it captures nonlinear monotonic relationships better than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
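
The calculation described above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of the ranks' standard deviations. This is an illustrative sketch, not the report's own implementation; the function names are made up, the sample data are toy values, and giving tied values the average of their ranks is an assumption (a common convention).

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: y = x**3 is monotonic but not linear in x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman(x, y), pearson(x, y))
```

On this toy input ρ is 1 up to floating-point rounding, while Pearson's r is lower because the relationship is not a straight line.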

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
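
A minimal sketch of this formula follows, together with a check of the location and scale invariance mentioned above (for positive scale factors). The function name and the toy data are illustrative and are not taken from the report.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Toy data, roughly linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

print(r, r_rescaled)
```

The two printed values agree up to floating-point rounding, which is the invariance property in action.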

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
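
The pair-counting definition can be sketched directly. The version below is the simple form described in the text (often called tau-a); statistical libraries commonly report a tie-adjusted variant instead, so values may differ when the data contain ties. The function name and the toy data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts for neither side
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Toy data: one swapped pair out of six, so tau = (5 - 1) / 6.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```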

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

      cement  blast_furnace_slag  fly_ash  water  superplasticizer  coarse_aggregate  fine_aggregate  age  concrete_compressive_strength
0     540.0   0.0                 0.0      162.0  2.5               1040.0            676.0           28   79.99
1     540.0   0.0                 0.0      162.0  2.5               1055.0            676.0           28   61.89
2     332.5   142.5               0.0      228.0  0.0               932.0             594.0           270  40.27
3     332.5   142.5               0.0      228.0  0.0               932.0             594.0           365  41.05
4     198.6   132.4               0.0      192.0  0.0               978.4             825.5           360  44.30
5     266.0   114.0               0.0      228.0  0.0               932.0             670.0           90   47.03
6     380.0   95.0                0.0      228.0  0.0               932.0             594.0           365  43.70
7     380.0   95.0                0.0      228.0  0.0               932.0             594.0           28   36.45
8     266.0   114.0               0.0      228.0  0.0               932.0             670.0           28   45.85
9     475.0   0.0                 0.0      228.0  0.0               932.0             594.0           28   39.29

Last rows

      cement  blast_furnace_slag  fly_ash  water  superplasticizer  coarse_aggregate  fine_aggregate  age  concrete_compressive_strength
1020  288.4   121.0               0.0      177.4  7.0               907.9             829.5           28   42.14
1021  298.2   0.0                 107.0    209.7  11.1              879.6             744.2           28   31.88
1022  264.5   111.0               86.5     195.5  5.9               832.6             790.4           28   41.54
1023  159.8   250.0               0.0      168.4  12.2              1049.3            688.2           28   39.46
1024  166.0   259.7               0.0      183.2  12.7              858.8             826.8           28   37.92
1025  276.4   116.0               90.3     179.6  8.9               870.1             768.3           28   44.28
1026  322.2   0.0                 115.6    196.0  10.4              817.9             813.4           28   31.18
1027  148.5   139.4               108.6    192.7  6.1               892.4             780.0           28   23.70
1028  159.1   186.7               0.0      175.6  11.3              989.6             788.9           28   32.77
1029  260.9   100.5               78.3     200.6  8.6               864.5             761.5           28   32.40

Duplicate rows

Most frequently occurring

      cement  blast_furnace_slag  fly_ash  water  superplasticizer  coarse_aggregate  fine_aggregate  age  concrete_compressive_strength  # duplicates
1     362.6   189.0               0.0      164.9  11.6              944.7             755.8           3    35.30                          4
3     362.6   189.0               0.0      164.9  11.6              944.7             755.8           28   71.30                          4
4     362.6   189.0               0.0      164.9  11.6              944.7             755.8           56   77.30                          4
5     362.6   189.0               0.0      164.9  11.6              944.7             755.8           91   79.30                          4
2     362.6   189.0               0.0      164.9  11.6              944.7             755.8           7    55.90                          3
6     425.0   106.3               0.0      153.5  16.5              852.1             887.1           3    33.40                          3
7     425.0   106.3               0.0      153.5  16.5              852.1             887.1           7    49.20                          3
8     425.0   106.3               0.0      153.5  16.5              852.1             887.1           28   60.29                          3
9     425.0   106.3               0.0      153.5  16.5              852.1             887.1           56   64.30                          3
10    425.0   106.3               0.0      153.5  16.5              852.1             887.1           91   65.20                          3